Robust Value Function Approximation by Working Backwards
Abstract
In this paper, we examine the intuition that TD(λ) is meant to operate by approximating asynchronous value iteration. We note that on the important class of discrete acyclic stochastic tasks, value iteration is inefficient compared with the DAG-SP algorithm, which essentially performs only one sweep instead of many by working backwards from the goal. The question we address in this paper is whether there is an analogous algorithm that can be used in large stochastic state spaces requiring function approximation. We present such an algorithm, analyze it, and give comparative results to TD on several domains.
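To make the "one sweep, working backwards from the goal" idea concrete, here is a minimal sketch of a backward sweep over a small acyclic stochastic task with an exact value table. It is only an illustration of the DAG-SP-style computation mentioned in the abstract, not the function-approximation algorithm the paper presents. The function name `backward_sweep`, the `transitions` layout, and the toy task are all assumptions made for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def backward_sweep(transitions, terminal_values):
    """Compute optimal expected cost-to-go in one pass over an acyclic task.

    transitions[s][a] is a list of (probability, next_state, cost) outcomes.
    terminal_values maps each goal/terminal state to its known value.
    (Both layouts are hypothetical, chosen for this sketch.)
    """
    # TopologicalSorter expects node -> nodes that must be processed first.
    # Here a state's successors must be valued before the state itself.
    deps = {
        s: {ns for outcomes in actions.values() for _, ns, _ in outcomes}
        for s, actions in transitions.items()
    }
    values = dict(terminal_values)
    for s in TopologicalSorter(deps).static_order():
        if s in values:
            continue  # terminal state, value already known
        # Single Bellman backup: every successor value is already final.
        values[s] = min(
            sum(p * (cost + values[ns]) for p, ns, cost in outcomes)
            for outcomes in transitions[s].values()
        )
    return values


# Toy acyclic task (assumed for illustration): A -> B -> G, with a risky shortcut.
transitions = {
    "A": {
        "left": [(1.0, "B", 1.0)],
        "right": [(0.5, "B", 0.0), (0.5, "G", 4.0)],
    },
    "B": {"go": [(1.0, "G", 1.0)]},
}
values = backward_sweep(transitions, {"G": 0.0})
print(values["A"], values["B"])
```

Because states are visited in reverse topological order, each state is backed up exactly once, whereas value iteration would repeat sweeps over all states until the values stop changing.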
Similar resources
Function Approximation Approach for Robust Adaptive Control of Flexible joint Robots
This paper is concerned with the problem of designing a robust adaptive controller for flexible joint robots (FJR). Under the assumption of weak joint elasticity, the FJR is first modeled and converted into singular perturbation form. The control law consists of a FAT-based adaptive control strategy and a simple correction term. The first term of the controller is used to stabilize the slow dy...
Learning Evaluation Functions for Large Acyclic Domains
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD(λ) algorithm to learn evaluation functions. In this paper, we examine the intuition that TD(λ) operates by approximating asynchronous value iteration. We note that on the important subclass of acyclic tasks, value iteration is inefficient compared with another graph algorithm, DAG-SP, wh...
Robust adaptive control of voltage saturated flexible joint robots with experimental evaluations
This paper is concerned with the problem of designing and implementing a robust adaptive control strategy for flexible joint electrically driven robots (FJEDR), while considering the constraints on the actuator voltage input. The control design procedure is based on the function approximation technique, to avoid saturation besides being robust against both structured and unstructured uncertaintie...
An Alternative Stability Proof for Direct Adaptive Function Approximation Techniques Based Control of Robot Manipulators
This short note points out an improvement on the robust stability analysis for electrically driven robots given in the paper. In the paper, the author presents a FAT-based direct adaptive control scheme for electrically driven robots in the presence of nonlinearities associated with actuator input constraints. However, the stability analysis he offers for the closed-loop system is not suitable. In other w...